Abstract

No significant general-purpose method is currently 
available to mechanically transform system require-
ments into a provably equivalent model. The widespread
use of such a method represents a necessary step to- 
ward high-dependability system engineering for numer- 
ous application domains. Current tools and methods that 
start with a formal model of a system and mechanically pro- 
duce a provably equivalent implementation are valuable but 
not sufficient. The “gap” unfilled by such tools and meth- 
ods is that the formal models cannot be proven to be 
equivalent to the requirements. We offer a method for me-
chanically transforming requirements into a provably
equivalent formal model that can be used as the ba- 
sis for code generation and other transformations. This 
method is unique in offering full mathematical tractabil- 
ity while using notations and techniques that are well 
known and well trusted. Finally, we describe further ap- 
plication areas we are investigating for use of the ap- 
proach. 
1. Introduction

Development of a system that will have a high level of re- 
liability requires the developer to represent the system as a 
formal model that can be proven to be correct. Through the 
use of currently available tools, the model can then be auto- 
matically transformed into code with minimal or no human 
intervention to reduce the chance of inadvertent insertion 
of errors by developers. Automatically producing the for- 
mal model from customer requirements would further re- 
duce the chance of insertion of errors by developers. 

The need for ultra-high dependability systems increases 
continually, along with a correspondingly increasing need
to ensure correctness in system development. By “correct-
ness”, we mean that the implemented system is equivalent
to the requirements, and that this equivalence can be proved 
mathematically. 

Available system development tools and methods that 
are based on formal models provide neither automated gen- 
eration of the models from requirements nor automated 
proof of correctness of the models. Hence, today there is 
no automated means to produce a system or a procedure 
that is a provably correct implementation of the customer’s 
requirements. Further, requirements engineering as a disci- 
pline has yet to produce an automated, mathematics-based 
process for requirements validation. 

2. Problem Statement 

Automatic code generation from requirements has been 
the ultimate objective of software engineering almost since 
the advent of high-level programming languages, and calls 
for a “requirements-based programming” capability have 
become deafening [9]. Several tools and products exist in 
the marketplace for automatic code generation from a given 
model. However, they typically generate code, portions of 
which are never executed, or portions of which cannot be 
justified from either the requirements or the model. More- 
over, existing tools do not and cannot overcome the funda- 
mental inadequacy of all currently available automated de- 
velopment approaches, which is that they include no means 
to establish a provable equivalence between the require- 
ments stated at the outset and either the model or the code 
they generate. 

Traditional approaches to automatic code generation, in- 
cluding those embodied in commercial products such as 
Matlab [20], in system development toolsets such as the B- 
Toolkit [19] or the VDM++ toolkit [17], or in academic re- 
search projects, presuppose the existence of an explicit (for- 
mal) model of reality that can be used as the basis for sub- 
sequent code generation, as shown in Figure 1 (a). While 
such an assumption is reasonable, the advantages and disad-
vantages of the various modeling approaches used in com-
puting are well known, and certain models can serve well
to highlight certain issues while suppressing other less rel-
evant details [22]. It is clear that the converse is also true.
Certain models of reality, while successfully detailing many
of the issues of interest to developers, can fail to capture
some important issues, or perhaps even the most important
issues. Existing reverse-engineering approaches suffer from
a similar plight. Typically (see Figure 1 (b)), a model is
extracted from an existing system and is then represented
in various ways, for example as a digraph [21]. The re-
engineering process then involves using the resulting rep-
resentation as the basis for code generation, as above.

Figure 1. (a) Traditional software develop-
ment process from requirements to code,
and (b) reverse engineering from code to a
system description.

2.1. Specifications, Models, and Designs 

The model on which automatic code generation is 
based is referred to as a design, or more correctly, a de- 
sign specification. There is typically a mismatch be- 
tween the design and the implementation (sometimes 
termed the “specification-implementation gap”), in that 
the process of going from a suitable design to an imple- 
mentation involves many practical decisions that must 
be made by the automated tool used for code genera- 
tion without any clear-cut justifications, other than the 
predetermined implementation decisions of the tool de-
signers. There is a more problematic “gap”, termed the
“analysis-specification gap”, that emphasizes the prob-
lem of capturing requirements and adequately representing
them in a specification that is clear, concise, and com-
plete. Unless the specification is formal, proof of cor-
rectness is impossible [1]. Unfortunately, many are reluc-
tant to embrace formal specification techniques, believ-
ing them to be difficult to use and apply [2] [7], despite
many industrial success stories [11] [12] [13] [24]. Our ex-
perience at NASA Goddard Space Flight Center (GSFC)
has been that while engineers are happy to write de-
scriptions as natural language scenarios, or even using
semi-formal notations such as Unified Modeling Lan-
guage (UML) use cases, they are loath to undertake formal
specification.

Figure 2. The R2D2C approach, generating a
formal model from requirements and produc-
ing code from the formal model, with auto-
matic reverse engineering.

2.2. A Novel Approach 

The approach described herein, provisionally named 
R2D2C (“Requirements to Design to Code”), provides 
mathematically tractable round-trip engineering for sys- 
tem development. 

In this approach, engineers (or others) may write require- 
ments as scenarios in constrained (domain-specific) natural 
language, or in a range of other notations (including UML 
use cases). These will be used to derive a formal model (Fig- 
ure 2) that is guaranteed to be equivalent to the requirements 
stated at the outset, and which will subsequently be used as 
a basis for code generation. The formal model can be ex- 
pressed using a variety of formal methods. Currently we are 
using CSP, Hoare’s language of Communicating Sequen- 
tial Processes [15] [16], which is suitable for various types 
of analysis and investigation, and as the basis for fully for- 
mal implementations as well as automated test case genera- 
tion, etc. 

R2D2C is unique in that it allows for full formal devel- 
opment from the outset, and maintains mathematical sound- 
ness through all phases of the development process, from re- 
quirements through to automatic code generation. The ap- 
proach may also be used for reverse engineering, that is, 
in retrieving models and formal specifications from exist-
ing code (Figure 2). The method can also be used to “para-
phrase” (in natural language, etc.) formal descriptions of ex-
isting systems. In addition, the approach is not limited to
generating executable code. It may also be used to generate
business processes and procedures, and we are currently ex- 
perimenting with using it to generate instructions for robotic 
devices to be used on the Hubble Robotic Servicing Mission 
(HRSM). We are also experimenting with using it as a ba- 
sis for an expert system verification tool, and as a means of 
capturing expert knowledge for expert systems. Such poten- 
tial applications will be described in Section 4. 

3. Technical Approach 

Section 3.1 describes R2D2C at a relatively high level. 
Section 3.2 describes an intermediate version of the ap- 
proach for which we have built a prototype tool [23], and 
with which we have successfully undertaken some exam- 
ples. 

3.1. R2D2C 

The R2D2C approach involves a number of phases, 
which are reflected in the system architecture described in 
Figure 3. The following describes each of these phases. 

D1 Scenarios Capture: Engineers, end users, and others 
write scenarios describing intended system operation. 
The input scenarios may be represented in a con- 
strained natural language using a syntax-directed ed- 
itor, or may be represented in other textual or graphi- 
cal forms. 

D2 Traces Generation: Traces and sequences of atomic 
events are derived from the scenarios defined in D1.

D3 Model Inference: A formal model, or formal specifi- 
cation, expressed in CSP is inferred by an automatic 
theorem prover - in this case, ACL2 [18] - using the 
traces derived in phase D2. A deep¹ embedding of the
laws of concurrency [14] in the theorem prover gives 
it sufficient knowledge of concurrency and of CSP to 
perform the inference. The embedding will be the topic 
of a future paper. 

D4 Analysis: Based on the formal model, various analy- 
ses can be performed, using currently available com- 
mercial or public domain tools, and specialized tools 
that are planned for development. Because of the na- 
ture of CSP, the model may be analyzed at different 
levels of abstraction using a variety of possible imple- 
mentation environments. This will be the subject of a 
future paper. 

D5 Code Generation: The techniques of automatic code
generation from a suitable model are reasonably well
understood. The present modeling approach is suitable
for the application of existing code generation tech-
niques, whether using a tool specifically developed for
the purpose, or existing tools such as FDR [5], or con-
verting to other notations suitable for code generation
(e.g., converting CSP to B [3] and then using the code
generating capabilities of the B Toolkit).

¹ “Deep” in the sense that the embedding is semantic rather than merely
syntactic.

Figure 3. The entire process with D1 thru D5
illustrating the development approach and
R1 thru R4 the reverse engineering.
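The step from traces (D2) to a model (D3) can be illustrated with a deliberately simple stand-in. The sketch below does not use ACL2 or the laws of concurrency; it merges traces that share a prefix into a tree and prints the tree using CSP prefix (->) and external choice ([]). The event names and the functions build_prefix_tree, to_csp, and is_trace_of are hypothetical, chosen only for this illustration.

```python
from typing import Dict, Iterable, Sequence

# A prefix tree: each key is an event, each value is the subtree that follows it.
PrefixTree = Dict[str, "PrefixTree"]


def build_prefix_tree(traces: Iterable[Sequence[str]]) -> PrefixTree:
    """Merge traces that share a common prefix into one tree."""
    root: PrefixTree = {}
    for trace in traces:
        node = root
        for event in trace:
            node = node.setdefault(event, {})
    return root


def to_csp(node: PrefixTree) -> str:
    """Render the tree with CSP prefix (->) and external choice ([])."""
    if not node:
        return "STOP"
    branches = [f"{event} -> {to_csp(child)}" for event, child in sorted(node.items())]
    if len(branches) == 1:
        return branches[0]
    return "(" + " [] ".join(branches) + ")"


def is_trace_of(node: PrefixTree, trace: Sequence[str]) -> bool:
    """Check that the model can perform the given sequence of events."""
    for event in trace:
        if event not in node:
            return False
        node = node[event]
    return True


# Hypothetical traces, as phase D2 might produce from two scenarios.
traces = [
    ["request", "grant", "release"],
    ["request", "deny"],
]
model = build_prefix_tree(traces)
print(to_csp(model))
# prints: request -> (deny -> STOP [] grant -> release -> STOP)
```

Every input trace is a trace of the resulting process, which is the property is_trace_of checks. A prefix tree cannot generalize beyond the traces it was given (it never introduces recursion or parallel composition), which is one reason the full approach relies on a theorem prover instead.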



Figure 4. Reverse engineering of system using R2D2C.


It should be re-emphasized that the “code” that is gen- 
erated may be code in a high-level programming lan- 
guage, low-level instructions for (electro-) mechanical 
devices, natural-language business procedures and instruc-
tions, or the like. As Figure 4 illustrates, the above process
may also be run in reverse: 

R1 Model Extraction: Using various reverse engineering
techniques [25], a formal model expressed in CSP may 
be extracted. 

R2 Traces Generation: The theorem prover may be used 
to automatically generate traces based on the laws of 
concurrency and the embedded knowledge of CSP. 

R3 Analysis: Traces may be analyzed and used to check 
for various conditions, undesirable situations arising, 
etc. 

R4 Paraphrasing: A description of the system (or system 
components) may be retrieved in the desired format 
(natural language scenarios, UML use cases, etc.). 
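Phases R2 and R3 can be pictured with a small bounded enumeration. The sketch below assumes the extracted model is available as an explicit transition system, which is a simplification: it does not involve the theorem prover or the embedded CSP laws, and it only explores traces up to a fixed length. The state names, event names, and the functions traces_up_to and violating_traces are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Explicit transition system: state -> list of (event, next state).
TransitionSystem = Dict[str, List[Tuple[str, str]]]
Trace = Tuple[str, ...]


def traces_up_to(system: TransitionSystem, start: str, max_len: int) -> Set[Trace]:
    """Collect every trace of at most max_len events, starting from start."""
    found: Set[Trace] = {()}  # the empty trace is always possible
    frontier: Set[Tuple[Trace, str]] = {((), start)}
    for _ in range(max_len):
        next_frontier: Set[Tuple[Trace, str]] = set()
        for trace, state in frontier:
            for event, target in system.get(state, []):
                extended = trace + (event,)
                found.add(extended)
                next_frontier.add((extended, target))
        frontier = next_frontier
    return found


def violating_traces(traces: Set[Trace], before: str, after: str) -> List[Trace]:
    """Return traces in which `after` occurs without an earlier `before`."""
    bad = []
    for trace in sorted(traces):
        if after in trace and before not in trace[: trace.index(after)]:
            bad.append(trace)
    return bad


# Hypothetical model, as phase R1 might extract from existing code.
system: TransitionSystem = {
    "IDLE": [("request", "WAIT")],
    "WAIT": [("grant", "BUSY"), ("deny", "IDLE")],
    "BUSY": [("release", "IDLE")],
}
traces = traces_up_to(system, "IDLE", 3)
print(sorted(traces))
print(violating_traces(traces, before="request", after="grant"))
```

The second function is an example of the kind of condition R3 might check: here, that a grant never occurs before a request. A bounded search of this kind can only show the absence of violations up to the chosen length.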

Paraphrasing, whereby more understandable descrip- 
tions (above and beyond existing documentation) of ex- 
isting systems or system components are extracted, is 
likely to have useful application in future system mainte- 
nance for systems whose original design documents have 
been lost or systems that have been modified so much that 
the original design and requirements documents do not re-
flect the current system.

3.2. Short-cut R2D2C 

The approach described in Section 3.1 is the way that 
R2D2C is intended to be applied, from requirements speci- 
fication through to code generation. However, the approach 
requires significant computing power in the form of an au- 
tomated theorem prover performing significant inferences 
based on traces input and on its “knowledge” of the laws of 
concurrency. While this is well warranted for certain appli- 
cations, it is likely to be beyond the resources of many de- 
velopers and organizations. As a practical concession, we 
also define a reduced version of R2D2C called the short- 
cut version (Figure 5), whereby the use of a theorem prover 
is avoided, yet without sacrificing high confidence in the va- 
lidity of the approach. The following describes each of the 
phases for the shortcut R2D2C: 

S1 Scenarios Capture: As before, intended system behav-
ior is described by scenarios input in natural language
or an appropriate graphical or semi-formal notation.

S2 Translation to Intermediate Notation: Scenarios are
translated to an intermediate notation, termed EzyCSP,
which is a simple natural language-like subset of CSP
that can be used to describe a large number of situa-
tions and scenarios (recall that scenarios are domain
specific).

S3 Analysis: While far simpler than CSP, EzyCSP al-
lows some simple analyses to be performed.


S4 Implementation in Java: EzyCSP is sufficiently simple 
that it may easily be translated to Java and executed. 
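The flavor of phases S1 and S2 can be suggested with a toy translator. The syntax of EzyCSP is not given in this paper, so everything below is an invented stand-in: the one-sentence grammar ("X sends M to Y"), the dotted event rendering, and the functions translate_scenario and to_process_text are all hypothetical, and Python is used here only for illustration in place of the Java target of S4.

```python
import re
from typing import List, NamedTuple


class Event(NamedTuple):
    sender: str
    message: str
    receiver: str


# Invented constrained grammar: "<sender> sends <message> to <receiver>."
_SENTENCE = re.compile(
    r"^\s*(\w+)\s+sends\s+(\w+)\s+to\s+(\w+)\s*\.?\s*$", re.IGNORECASE
)


def translate_scenario(text: str) -> List[Event]:
    """Translate each scenario sentence into one event, rejecting anything else."""
    events = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines separate sentences and carry no meaning
        match = _SENTENCE.match(line)
        if match is None:
            raise ValueError(f"line {line_no} is outside the constrained grammar: {line!r}")
        events.append(Event(*match.groups()))
    return events


def to_process_text(name: str, events: List[Event]) -> str:
    """Render the events as a single sequential process in CSP-like text."""
    steps = [f"{e.sender}.{e.message}.{e.receiver}" for e in events]
    return f"{name} = " + " -> ".join(steps + ["STOP"])


# Hypothetical scenario, written in the invented grammar above.
scenario = """Operator sends start to Pump.
Pump sends ack to Operator."""
events = translate_scenario(scenario)
print(to_process_text("SCENARIO", events))
# prints: SCENARIO = Operator.start.Pump -> Pump.ack.Operator -> STOP
```

Rejecting any sentence outside the grammar, rather than guessing at its meaning, keeps the translation deterministic, which is what would make a one-time proof of the translator's correctness meaningful.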

This simplified or short-cut approach clearly has signif- 
icant disadvantages when compared to our full approach. 
Firstly, the correctness of the development process is con- 
tingent on the correctness of both the translation of scenar- 
ios to the intermediate (EzyCSP) notation and the transla- 
tion of EzyCSP to Java. However, the correctness of the 
translators for these is assured via a proof of correctness 
undertaken with the ACL2 theorem prover. Secondly, we 
do not have a reverse process suitable to support reverse 
and (ultimately) re-engineering, for free. However, a Java- 
to-EzyCSP translator would certainly be possible for highly 
constrained subsets of Java. 

The significant advantage of this simplified approach, 
however, is that although a proof of correctness involving 
a theorem prover is still required, this is required exactly 
once and would be performed by the support system devel- 
opers (presumably expert in the art). This is significantly 
less expensive computationally than using a theorem prover 
in the development of each individual application. 



Figure 5. Short-cut R2D2C.


4. Application Areas 

The motivation for this work was the need for 
requirements-based programming for ultra-high depend-
ability systems. The method described in this paper is
applicable in a number of areas, with potentially signifi- 
cant value in the following: 

Sensor Networks 

NASA is currently conducting research and development 
on sensor networks for planetary and solar system explo- 
ration as well as to support its Mission to Planet Earth. 









An example of a sensor network for solar system explo- 
ration is the Autonomous Nano Technology Swarm mis- 
sion (ANTS) [4], which is at the concept development 
phase. This mission will send 1,000 pico-class (approxi-
mately 1 kg) spacecraft to explore the asteroid belt. The
ANTS spacecraft will act as a sensor network making obser- 
vations of asteroids and analyzing their composition. Sensor 
networks are also being considered for planetary (e.g., Mar- 
tian) exploration, to yield scientific information on weather 
and geology. For the Mission to Planet Earth, sensor net- 
works are already being researched and developed towards 
capabilities for early warnings about natural disasters and 
climate change. 

Projected NASA sensor networks are highly distributed 
autonomous “systems of systems” that must operate with a 
high degree of reliability. The solar system and planetary 
exploration networks will necessarily experience long com- 
munications delays with Earth, will partly and occasionally 
be out of touch with the Earth and mission control for long 
periods of time, and must operate under extremes of dy- 
namic environmental conditions. Due to the complexity of 
these systems as well as their distributed and parallel nature, 
they will have an extremely large state space and will be 
impossible to test completely using traditional testing tech- 
niques. The more “code” or instructions that can be gener- 
ated automatically from a verifiably correct model, the less 
likely that human developers will introduce errors. In addi- 
tion, the higher the level of abstraction that developers can 
work from, as is afforded through the use of scenarios to de- 
scribe system behavior, the less likely that a mismatch will 
occur between requirements and implementation and the 
more likely that the system can be validated. Working from 
a higher level of abstraction will also allow errors in the 
system to be more easily caught, since developers can bet- 
ter see the “big picture” of the system. In addition to allow- 
ing complex systems developers to work at a higher level of 
abstraction, R2D2C also converts the scenarios into a for- 
mal model that can be analyzed for concurrency-related er- 
rors and consistency and completeness, as well as domain- 
specific errors. 
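The scale of the state space mentioned above can be made concrete with a rough bound. Assume, purely for illustration, a network of n components in which component i has a set S_i of local states; the number of global states is then at most the product of the local counts. Taking n = 1,000 (the ANTS spacecraft count) and only k = 2 local states per spacecraft, a figure chosen here as an assumption and not taken from the mission design:

```latex
|S_{\text{global}}| \;\le\; \prod_{i=1}^{n} |S_i| \;=\; k^{n}
\quad \text{when } |S_i| = k \text{ for all } i,
\qquad
n = 1000,\; k = 2 \;\Rightarrow\; 2^{1000} \approx 1.07 \times 10^{301}.
```

Even under this minimal assumption the bound is far beyond what exhaustive testing could cover.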

Expert Systems 

We have been studying the potential use of this approach 
in the development, maintenance, and verification of expert 
systems. In particular, we have been giving consideration 
to using the R2D2C method in verifying the expert system 
used in the NASA ground control center for the POLAR 
spacecraft, which performs multi-wavelength imaging of 
the Earth’s aurora. The POLAR ground control expert sys- 
tem has rules written in the production system CLIPS [6] for 
automated “lights out” (untended) operation of the space- 
craft. A suitable translator from CLIPS (rather than natu- 
ral language) to CSP (or EzyCSP) enables us to use this 
technology to examine existing expert system rule bases for
consistency, etc. What has proven to be of great interest,
however, is the ability to generate CLIPS rules from CSP (or 
EzyCSP), just as we would generate code in Java or C++. 
POLAR ground control center personnel expect this would 
be a great benefit because it would give them a means of 
capturing expert knowledge, from natural language descrip- 
tion through to CLIPS rules, while maintaining correctness, 
which heretofore has not been available. 

Robotic Operations 

As pointed out earlier, the “code” generated by this ap- 
proach need not be specifically code in a programming lan- 
guage, and we have been experimenting with generating 
code to control robots. Perhaps more interesting is the use 
of this approach to investigate the validity and correctness 
of procedures for complex robotic assembly or repair tasks 
in space. We have begun exploratory work in this direction, 
to provide an additional means to validate procedures from 
the Hubble Robotic Servicing Mission (HRSM) - for exam- 
ple, the procedures for replacement of cameras on the Hub- 
ble Space Telescope (HST). 

5. Related Work 

Harel [8] [10] has advocated scenario-based program- 
ming through UML use cases and play-in scenarios. The 
present work differs in that it uses scenarios in the form 
of structured text that is easily understandable by engineers 
and non-engineers. In addition, converting the structured
text to traces and then the traces to a formal model
allows us to use a wide range of formal methods tools
(e.g., model checkers), which can be used to verify and val- 
idate the system. 

NASA Ames has been working on the automatic trans-
lation of UML use cases to executable code, and reports suc-
cess in using the approach on large applications [26]. Our
approach is different, however, in that we are not limited 
to UML use cases, nor to natural language. R2D2C ac- 
commodates any input mechanism whereby requirements 
can be represented as scenarios, and traces extracted. Our 
approach works equally well with graphical, mathemati- 
cal, and textual requirements representations. More impor- 
tantly, the key to our approach and what makes it invalu- 
able for high-dependability applications is the full formal 
basis, and complete mathematical tractability from require- 
ments through to code. To our knowledge, no other cur- 
rently available automated development methodology can 
make this claim. 

6. Conclusions and Future Work 

R2D2C is a unique approach to the automatic derivation 
of ultra-high dependability systems. It is unique in that it 
supports fully (mathematically) tractable development from
requirements elicitation through to automatic code genera-
tion (and back again). While other approaches have sup-
ported various subsets of the development lifecycle, there
has been heretofore a “jump” in deriving from the require- 
ments the formal model that is a prerequisite for sound au- 
tomatic code generation. Yet, R2D2C is a simple approach, 
combining techniques and notations that are well under- 
stood, well tried and tested, and trusted. The novelty of the 
approach, and the part of the approach that achieves conti- 
nuity in the development process, is the use of a theorem 
prover to reverse the laws of concurrency, and to achieve 
levels of inference that would be impossible for a human 
being to perform on all but trivial systems. 

R2D2C (and other approaches that similarly provide 
mathematical soundness throughout the development life- 
cycle) will decrease costs and delays for the engineering 
(and re-engineering) of ultra-high dependability systems 
through automated development. Such technology will dra- 
matically increase assurance of system success by ensuring 
that requirements are complete and consistent, implementa- 
tions are true to the requirements, automatically coded sys- 
tems are bug-free, and implementation behavior is as ex- 
pected. 

Future work will include improving the quality of the 
embedding of CSP in ACL2, and optimizing that for effi- 
ciency. We plan a plethora of support tools to allow us to 
easily change the level of abstraction in a formal model, to 
visualize various system models and changes in those mod- 
els, and to aid in tracking changes through the development 
process (or the reverse engineering process). We plan to en- 
hance our existing prototype to support the full version of 
R2D2C, to make it into a fully functional robust proto- 
type, and to apply it to significant examples. 